Direct Candidates Generation: A Novel Algorithm for Discovering Complete Share-Frequent Itemsets

Authors

  • Yu-Chiang Li
  • Jieh-Shan Yeh
  • Chin-Chen Chang
Abstract

The share of an itemset is one way of evaluating its magnitude. From a business perspective, itemset share values better reflect the significance of itemsets when mining association rules in a database. The Share-counted FSM (ShFSM) algorithm is one of the best algorithms for discovering all share-frequent itemsets efficiently. However, ShFSM spends computation time on the join and prune steps of candidate generation in each pass, and it generates too many useless candidates. This study therefore proposes the Direct Candidates Generation (DCG) algorithm, which generates candidates directly, without the join and prune steps in each pass. Moreover, DCG generates fewer candidates than ShFSM. Experimental results show that the proposed method performs significantly better than ShFSM.
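
The abstract does not define the share measure, so the following is only a minimal sketch. It assumes the commonly used definition: the share of an itemset is the total quantity of its items, summed over the transactions that contain the whole itemset, divided by the total quantity of all items in the database. The function names, the toy database, and the 0.3 threshold are illustrative assumptions. This is not the DCG or ShFSM algorithm, only the quantity those algorithms test candidates against.

```python
from typing import Dict, Iterable, List

# A transaction maps each purchased item to its quantity (assumed representation).
Transaction = Dict[str, int]


def total_measure_value(db: List[Transaction]) -> int:
    """Sum of all item quantities in the database."""
    return sum(sum(t.values()) for t in db)


def itemset_share(itemset: Iterable[str], db: List[Transaction]) -> float:
    """Share of `itemset`: quantity of its items in the transactions that
    contain every item of the itemset, divided by the total quantity."""
    items = set(itemset)
    tmv = total_measure_value(db)
    if tmv == 0:
        return 0.0
    # Only transactions containing the whole itemset contribute.
    local = sum(sum(t[i] for i in items) for t in db if items.issubset(t))
    return local / tmv


def is_share_frequent(itemset: Iterable[str], db: List[Transaction],
                      min_share: float) -> bool:
    """An itemset is share-frequent if its share reaches the threshold."""
    return itemset_share(itemset, db) >= min_share


# Hypothetical toy database, for illustration only.
db = [
    {"A": 1, "B": 2},
    {"A": 4},
    {"B": 1, "C": 3},
    {"A": 2, "B": 1, "C": 1},
]

if __name__ == "__main__":
    for candidate in (["A"], ["B"], ["A", "B"]):
        print(candidate, round(itemset_share(candidate, db), 3),
              is_share_frequent(candidate, db, min_share=0.3))
```

In this toy database, {B} falls below the 0.3 threshold while its superset {A, B} reaches it. A share-frequent itemset can therefore have a subset that is not share-frequent, which is why candidate generation for share-frequent itemsets cannot rely on Apriori-style subset pruning alone.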


Similar resources

A Fast Algorithm for Mining Share-Frequent Itemsets

Itemset share has been proposed as a measure of the importance of itemsets for mining association rules. The value of the itemset share can provide useful information, such as the total profit or the total quantity purchased by customers, associated with an itemset in a database. The discovery of share-frequent itemsets does not have the downward closure property. Existing algorithms for discovering share-freq...


Efficient Algorithms for Mining Share-frequent Itemsets

Itemset share has been proposed to evaluate the significance of itemsets for mining association rules in databases. The Fast Share Measure (FSM) algorithm is one of the best algorithms to discover all share-frequent itemsets efficiently. However, FSM is fast only when dealing with small datasets. In this study, we shall propose a revised version of FSM, called the Enhanced FSM (EFSM) algorithm ...


Pincer-Search: A New Algorithm for Discovering the Maximum Frequent Set

Discovering frequent itemsets is a key problem in important data mining applications, such as the discovery of association rules, strong rules, episodes, and minimal keys. Typical algorithms for solving this problem operate in a bottom-up breadth-first search direction. The computation starts from frequent 1-itemsets (minimal length frequent itemsets) and continues until all maximal (length) freq...


Fast Association Rule Mining Algorithm for Spatial Gene Expression Data

One of the important problems in data mining is discovering association rules from spatial gene expression data where each transaction consists of a set of genes and probe patterns. The most time-consuming operation in this association rule discovery process is the computation of the frequency of occurrence of interesting subsets of genes (called candidates) in the database of spatial gene ...


MINING FUZZY TEMPORAL ITEMSETS WITHIN VARIOUS TIME INTERVALS IN QUANTITATIVE DATASETS

This research aims at proposing a new method for discovering frequent temporal itemsets in continuous subsets of a dataset with quantitative transactions. It is important to note that although these temporal itemsets may have relatively high support or occurrence within particular time intervals, they do not necessarily get similar support across the whole dataset, which makes i...




Publication date: 2005